mixture prepared by mixing the specimens of group A
with the group B of specimens and detecting the inter-
the group A of specimens according to the law based on
the binary notation serves to remarkably reduce the frequency
of the reactions required for detecting the interaction
and permits rapid screening.
Description

Technical Field 

[0001] The present invention relates to a method for efficiently screening an enormous number of test samples. More specifically, it relates to a method for efficiently detecting the correlation between test samples in groups A and B when they have material interactions between them. For example, the method of the present invention is especially useful for genomic structural analysis.

Background Art

[0002] Research on genomic structural analysis these days can be classified as "genome mapping" and "sequencing." In genome mapping, the chromosomal DNA structure is reconstructed by mapping the genome using various techniques and by aligning many fragmented genomic DNAs. In sequencing, the nucleotide sequence of genomic DNA is clarified by determining nucleotide sequences of aligned DNA fragments. The present invention is especially useful for genome mapping.

[0003] Previously, when test samples in groups A and B materially interacted, detecting their correlation required examining, for each test sample withdrawn one by one from group A, whether or not it interacted with each test sample in group B. Therefore, if group A consists of m test samples and group B consists of n test samples, m × n screenings were required.

[0004] An example of such a conventional screening method is the method for correlating Sequence Tagged Site (STS) markers (group A) with Bacterial Artificial Chromosome (BAC) clones (group B). STS is a concept for systematically marking the human genome (Olson, M. et al., Science 245: 1434-1435, 1989). STSs consist of short nucleotide sequences of about 200 to 300 bp and possess a sequence which cannot be found in other sites of the genome. Accordingly, the same STS contained in multiple clones indicates that these clones share common regions. By performing PCR using primers designed for an STS with genomic DNA as the template, the amplified product of a length corresponding to the STS can be confirmed as a single band (S. B. Primrose, "Principles of Genome Analysis," Blackwell Science Ltd., 1995). In the combination of STS markers (group A) and BAC clones (group B), a BAC clone corresponding to an STS marker has conventionally been detected by PCR screening or hybridization screening of the STS markers, one by one, against the BAC clones.

[0005] Physical mapping using STS has been used in a limited field by many researchers. The region of the causative gene for cystic fibrosis covered by 30 YAC clones in chromosome 7 has been integrated into a single aligned clone of more than 1.5 Mb (Green & Olson, Science 250: 94-98, 1990). Foote et al. succeeded in aligning 196 clones covering more than 98% of the euchromatin region of the human Y chromosome (Foote et al., Science 258: 60-66, 1992). A YAC-STS integrated map, a combination of a physical map with a genetic map, on the long arm (q) of chromosome 21 has been prepared (Chumakov et al., Nature 359: 380-387, 1991).

[0006] These days, a combination of BAC libraries or PAC (P1-derived artificial chromosome) libraries with STS markers enables covering most of the human genome. However, a method for efficiently detecting numerous correspondences of STSs to such large-scale libraries has not been established.

[0007] In these conventional methods, the increasing number of test samples in groups A and B results in a geometrically progressive increase in the number of screenings, requiring enormous amounts of time and labor.

[0008] For example, in genome mapping in general, screening of DNA libraries usually requires numerous repetitions of filter hybridizations or a series of PCR assays prepared for each library (Asakawa et al., Gene 191: 69-79, 1997). Therefore, to align library clones covering the entire human genome, numerous combinations of DNA libraries and probes must be screened.

[0009] The utilization of DNA chips for genomic analysis is highly expected to facilitate speedier screening. Since oligonucleotides of desired nucleotide sequences can be accumulated in high density on DNA chips, the hybridization assay can be carried out for numerous combinations by a single hybridization. In fact, the mapping of 256 varieties of STS markers against yeast cosmid clones using DNA chips has been reported (Sapolsky, R. J. et al., Genomics 33: 445-456, 1996). However, according to the conventional approach to these problems, the correlation must be examined for all probable combinations of DNAs and probes as before, even with DNA chips. The utilization of DNA chips as such thus does not provide a novel principle enabling the efficient detection of correlation for numerous combinations.

Disclosure of the Invention 

[0010] The present inventors considered that the utilization of mixed test samples might reduce the work for detecting the interaction between test samples. Naturally, the random mixing of test samples is not useful for the final clarification of correlation based on their interactions. By systematically mixing test samples in group A based on binary notation and identifying the interactions between these mixtures and test samples in group B, the present inventors have found that the correlation between the groups can be identified more efficiently than by prior methods, thereby accomplishing this invention.

[0011] An objective of this invention is to efficiently detect the correlation by the following method, when test samples (Ai) in group A correlate to test samples (Bj) in group B based on material interactions. Namely, the present invention relates to the following method:

[1] a method for determining a combination of test samples out of those constituting groups A and B which correlate physically, chemically or biologically, wherein said method comprises the following steps:

(1) providing m ($2^{n-1} \le m \le 2^{n}-1$, where m and n are natural numbers, $m \ge 3$, $n \ge 2$) test samples Ai ($1 \le i \le m$) in group A and x (x is a natural number) test samples Bj ($1 \le j \le x$) in group B,

(2) assigning a g-bit ($n \le g$) ID number based on the binary notation to each test sample Ai in group A,

(3) mixing test samples Ai in group A having "1" for the first bit of ID numbers based on binary notation to make mixture C1, and similarly mixing test samples Ai in group A having "1" for the k-th ($1 \le k \le g$) bit of ID numbers to make mixture Ck, thus obtaining g varieties of mixtures comprising mixtures from C1 through Cg,

(4) detecting the interaction of each of the g varieties of mixtures from C1 through Cg with test samples Bj in group B,

(5) determining g-bit binary numbers having "1" or "0" for the k-th bit by assigning "1" when the interaction is detected between each mixture constituting mixtures from C1 through Cg and Bj in group B, and "0" when no interaction is detected, and

(6) determining the correlation between test sample Ai in group A and test sample Bj in group B by referring test sample Ai in group A to the corresponding binary number obtained;

[2] the method of [1], wherein the correlation between test samples involves the interaction between test samples constituting group A and group B;

[3] the method of [2], wherein the correlation based on the interaction between test samples is in the ratio of 1 : 1 or 1 : many;

[4] the method of [1], wherein g is n;

[5] the method of [4], wherein each test sample in group A is assigned an individual ID number up to $2^{n}-1$;

[6] the method of any one of [1] through [5], wherein said method comprises a step of detecting the interaction between mixture Ca, obtained by mixing all test samples in group A, and test sample Bj in group B;

[7] a method for determining a combination of test samples out of those constituting groups A and B and correlating them physically, chemically or biologically, wherein said method comprises assigning ID numbers of more than two series to one test sample in group A followed by the repetition of the method of [1];

[8] the method of any one of [1] through [7], wherein said test sample in group A is an oligonucleotide and said test sample in group B is DNA; and

[9] an STS mapping method comprising performing the method of [8], wherein said method uses STS markers as test samples in group A and genome libraries as test samples in group B.

[0012] The principle of the present invention is as follows. 

(1) First, "m" ($2^{n-1} \le m \le 2^{n}-1$, where m and n are natural numbers, $m \ge 3$, $n \ge 2$) test samples Ai ($1 \le i \le m$) in group A and "x" (x is a natural number) test samples Bj ($1 \le j \le x$) in group B are provided, and

(2) each test sample Ai in group A is assigned a number based on binary notation.

An ID number is systematically assigned by converting the number based on decimal notation to binary notation as shown in Table 1 (wherein p, q, r, P, Q, and R are "0" or "1"). In order to assign each test sample an individual number based on the binary notation, $2^{n}-1$ numbers with n bits are required. In the following, the number assigned to each test sample in group A may be referred to as the ID number.



Table 1

Test sample in group A   Individual decimal number   Individual n-bit binary number
A1                       1                           ...001
A2                       2                           ...010
Ai                       i                           ...pqr (... + 4p + 2q + r = i)
...                      ...                         ...
Am                       m                           ...PQR (... + 4P + 2Q + R = m)



The mixtures are then prepared by mixing test samples in group A according to the following procedure (3).

(3) Test samples Ai in group A having "1" for the first bit of the ID number based on binary notation are mixed to make mixture C1. Similarly, test samples Ai in group A having "1" for the k-th bit of ID numbers are mixed to make mixture Ck ($1 \le k \le n$) so that n different mixtures from C1 through Cn are obtained.

The criteria for preparing these mixtures are shown in Table 2, wherein Cn through C1 represent mixtures, "1" indicates the addition of each test sample in group A to each mixture, and "0" indicates no addition.



Table 2

Test sample in group A   Cn    ...   Ck    ...   C3    C2    C1
A1                       ...   ...   ...   ...   0     0     1
A2                       ...   ...   ...   ...   0     1     0
...                      ...   ...   ...   ...   ...   ...   ...
Ai                       ...   ...   ...   ...   p     q     r
...                      ...   ...   ...   ...   ...   ...   ...
Am                       ...   ...   ...   ...   P     Q     R

By utilizing the mixtures thus obtained and performing procedures (4) and (5) below, the binary numbers by which test samples in group A are specified are determined.

(4) With each of these n varieties of mixtures from C1 through Cn, the interaction of test samples Bj in group B is detected, and

(5) for each mixture Ck constituting mixtures from C1 through Cn, Ck is assigned "1" when the interaction with test sample Bj in group B is detected, and "0" when the interaction is not detected, thus determining an n-bit binary number having "0" or "1" for the k-th bit.

Procedures for determining the binary numbers are summarized in Table 3. In this table, the presence of interaction of mixtures Cn through C1 with each of the test samples constituting group B is indicated as "1," and the absence, as "0." Based on this table, the n-th bit represents the detection result of the interaction of test sample mixture Cn of group A with a specific test sample in group B.
Table 3

Test sample in group B   Cn    ...   Ck    ...   C3    C2    C1    Test sample in group A
B1                       ...   ...   ...   ...   ...   ...   ...   ...
B2                       ...   ...   ...   ...   0     0     1     A1
Bj                       ...   ...   ...   ...   s     t     u     A(... + 4s + 2t + u)
...                      ...   ...   ...   ...   0     1     0     A2
Bx                       ...   ...   ...   ...   S     T     U     A(... + 4S + 2T + U)



Finally, procedure (6) determines the test sample in group A corresponding to a specific test sample in group B.

(6) The correlation between test sample Ai in group A and test sample Bj in group B is determined by referring to the test sample Ai in group A which corresponds to the binary number obtained in (5).
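
Procedures (1) through (6) can be sketched in code as follows. This is only a minimal illustration under stated assumptions: the sample names, the `interacts` predicate (a stand-in for the laboratory detection step), and the function names are all hypothetical and do not come from the specification. It also assumes each group B sample interacts with at most one group A sample.

```python
def build_mixtures(ids: dict[str, int], n_bits: int) -> list[set[str]]:
    """Mixture Ck (stored at index k-1) collects the group A samples whose k-th bit is 1."""
    return [{name for name, number in ids.items() if (number >> k) & 1}
            for k in range(n_bits)]


def read_binary_number(mixtures, interacts, b) -> int:
    """Test one group B sample against every mixture and assemble the n-bit result."""
    number = 0
    for k, mixture in enumerate(mixtures):
        # Stand-in for detecting the interaction of mixture Ck with sample Bj.
        if any(interacts(a, b) for a in mixture):
            number |= 1 << k
    return number


# Hypothetical data: A1..A5 receive ID numbers 1..5; B3 has no partner in group A.
ids = {f"A{i}": i for i in range(1, 6)}
truth = {"B1": "A3", "B2": "A5"}
interacts = lambda a, b: truth.get(b) == a

mixtures = build_mixtures(ids, n_bits=3)
by_number = {number: name for name, number in ids.items()}
result = {b: by_number.get(read_binary_number(mixtures, interacts, b))
          for b in ("B1", "B2", "B3")}
```

Each group B sample is read against three mixtures instead of five individual samples; a result of all zeros maps to no group A sample.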

[0013] In the present invention, the physical, chemical or biological correlation between test samples constituting groups A and B means the relationship by which test samples in both groups interact as mediated by a physical, chemical or biological reaction. Preferably, this reaction is specific so that it can be detected only between specific test samples. Also, the correlation in interactions between test samples in group A and those in group B means that test samples in group A and those in group B are somehow correlated so that their relationship can be clarified by detecting their interaction. For example, interactions between test samples can be represented by the binding reaction based on specific affinity. More specifically, the binding reaction is exemplified by hybridization of nucleic acids, antigen-antibody reactions, various ligand-receptor reactions, or enzyme-substrate reactions.

[0014] In addition, the interaction can be not only the binding reaction between materials but also the functional combination associated with signal transduction. The functional combination can be exemplified by the combination triggering the transcriptional initiation or signal transduction associated with the binding of a transcriptional regulatory factor to a transcriptional regulatory region, or the binding of an agonist compound to a membrane receptor. In the present invention, the correlation between test samples in both groups is not limited to 1 : 1, and can be 1 : many as shown in Figure 1. Preferably, in this invention, the correlation of test samples in group A with those in group B is as close to 1 : many or 1 : 1 as possible. However, the correlation can be accurately detected even in a many : 1 relation, for example according to the method described below utilizing this invention.

[0015] The method of the present invention can be applied to any combination if the interaction between test samples (Ai) in group A and test samples (Bj) in group B can be detected. However, since it is necessary to use mixtures of test samples in group A, the failure to detect the interaction with test samples in group B owing to the mixing of test samples belonging to group A should be avoided. The present invention can be utilized when the correlation between two groups must be detected. More preferably, the correlation can be detected based on the interaction between materials. This invention is especially useful for assaying numerous test samples such as in genomic analysis, high throughput screening, combinatorial chemistry, etc.

[0016] For example, group A can be oligonucleotide markers (such as STS markers, VNTR, RFLP, or microsatellite), and group B, DNA library clones (genomic library clones such as BAC, PAC, P1, YAC, cosmid vectors, etc.). This invention may thus be used for the binding assay for a gene using numerous transcriptional regulatory factors and analogues thereof, proteins and binding proteins, antigens and antibodies, enzymes and substrates, etc. Furthermore, this invention can be used for mapping cDNA and EST to the genome. In particular, genome mapping, wherein numerous repetitions of screening are required and the desirable correlation between test samples in two groups (that is, 1 : 1, or 1 : many) can be expected, is a useful field of application of the invention. The present inventors have designated the method of using this invention in genome analyses as "digital hybridization screening."

[0017] When there are three or more test samples (m) in group A, the use of this invention can reduce the number of screenings as compared with identifying the interactions with all combinations of test samples. For example, when m is three, the number of screenings (n) becomes two; when m is seven, n becomes three; and when m is 123, n becomes seven ($2^{n-1} \le m \le 2^{n}-1$; m and n are natural numbers, $m \ge 3$, $n \ge 2$). According to the present invention, seven repeated confirmations of interaction will thus give the same results as confirming interactions with each of the 123 test samples. In this invention, screening efficiency increases with the number of test samples in group A.
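
The relation between m and n above amounts to taking the bit length of m. The following sketch reproduces the three figures quoted in the text and compares the reaction counts of one-by-one and mixture-based screening; the function names and the group B size of 10 are illustrative assumptions only.

```python
def mixtures_needed(m: int) -> int:
    """Smallest n with 2**(n-1) <= m <= 2**n - 1, i.e. the bit length of m."""
    if m < 1:
        raise ValueError("m must be a positive integer")
    return m.bit_length()


def reactions(m: int, x: int) -> tuple[int, int]:
    """Return (one-by-one reactions, mixture-based reactions) for m samples in A and x in B."""
    return m * x, mixtures_needed(m) * x


counts = {m: mixtures_needed(m) for m in (3, 7, 123)}
one_by_one, pooled = reactions(123, 10)
```

With 123 test samples in group A and a hypothetical 10 in group B, the one-by-one approach needs 1230 reactions and the mixture-based approach needs 70.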
[0018] In addition, if it is guaranteed in advance that each test sample constituting group B will definitely correspond to one of the test samples in group A, there is theoretically an exceptional relation between m and n. In such a situation, the all-negative value (e.g., 00000000) can also be used as an ID number. Therefore, m and n are related as $2^{n-1}+1 \le m \le 2^{n}$, where m and n are natural numbers and $m \ge 3$, $n \ge 2$.

[0019] However, in some cases, having too many test samples in group A will decrease the sensitivity for detecting the correlation between test samples. The decreased sensitivity can be avoided by performing the present invention after dividing test samples in group A into appropriate subgroups. In some other cases, an increasing number of test samples in group A will increase the possibility that the correlation becomes "many : 1." An increase in "many : 1" correlations may reduce the accuracy of detecting correlation. A high possibility of a "many : 1" correlation may occur when DNA markers that are assumed to be very closely localized are used as the test samples in group A and the test sample in group B is a DNA library such as BAC clones. In such a case, DNA markers constituting group A are first collected in one group. The present invention is then applied with the group thus formed as one of the test samples in group A. Collecting test samples into one group means assigning the same ID number to different test samples in group A. BAC clones which are clearly not correlated should be selected out at this stage, followed by a secondary screening by an appropriate method. Alternatively, for analogues of chemical compounds, screening should be performed with a group of analogues as one test sample, then subjecting the analogues to a secondary screening by a suitable method. The secondary screening can be performed by individually identifying the mutual relations between test samples. For example, in order to detect the mutual relations between BAC clones and STSs by secondary screening, hybridization using individual probes or PCR using STS primers with the BAC clone as the template is performed to determine the correlation.

[0020] There are, however, no particular limitations on the number of test samples (x) in group B; the efficiency increases as the number (x) increases.

[0021] The number assigned to each test sample Ai in group A can be made an ID specific to the sample by selecting a suitable number up to $2^{n}-1$. In general, the ID number should be unique to each test sample. However, as described above, it is also possible to collectively give the same ID number to different test samples as one group. When there are close to $2^{n-1}$ test samples, the ID numbers may preferably be assigned randomly instead of sequentially from 1 so that numbers are properly distributed among test samples. Also, the difference in the number of test samples Ai in group A contained in each mixture Ck may preferably be minimized. For example, if group A comprises 64 test samples, only A64 (1000000) has "1" for the seventh bit of the ID number when numbers are assigned sequentially from 1. Therefore, each of mixtures C1 through C6 comprises 32 test samples, whereas mixture C7 contains only one test sample, A64 (1000000). This situation does not substantially affect the digital hybridization screening. However, equalizing the numbers of test samples comprising each mixture can be expected to standardize the labeling and to make the background level uniform. More specifically, it is possible to make the difference in the number of test samples comprising each mixture one or less. Even in the case of the 64 test samples above, it is possible to make the difference in the number of test samples in each mixture one or less by assigning successive numbers from the 32nd to the 95th based on decimal notation (from 0100000 to 1011111 in binary notation). By other numbering, it is also possible to distribute 64 test samples among mixtures wherein the difference in the number of test samples in each mixture is one or less.
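
The effect of the numbering scheme on mixture sizes can be checked with a short calculation. The sketch below counts, for each bit position, how many ID numbers have that bit set; the function name `mixture_sizes` is a hypothetical helper, and the two ranges are the ones discussed in the paragraph above.

```python
def mixture_sizes(id_numbers, n_bits: int) -> list[int]:
    """Number of test samples in C1..Cn: how many IDs have "1" at each bit position."""
    ids = list(id_numbers)
    return [sum((number >> k) & 1 for number in ids) for k in range(n_bits)]


# 64 test samples numbered sequentially from 1 (0000001 to 1000000).
sequential = mixture_sizes(range(1, 65), 7)

# The same 64 test samples numbered from 32 to 95 (0100000 to 1011111).
shifted = mixture_sizes(range(32, 96), 7)
```

Sequential numbering gives 32 samples in each of C1 through C6 and a single sample in C7, whereas the shifted numbering gives 32 samples in every mixture.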
[0022] In the present invention, test sample Ai in group A is assigned an ID number based on the binary notation and used to prepare mixture Ck. By using the binary notation, the results of interaction detection can be directly correlated with the numeral for each bit. In binary notation, a number is generally written with "1" and "0". However, in the present invention, a binary number can be expressed with symbols other than "1" and "0", since "1" or "0" merely indicates the presence or absence of a bit in binary notation and does not limit the invention to using only "1" or "0."
[0023] In preparing mixture Ck ($1 \le k \le n$), which comprises test samples Ai in group A, the amount of each test sample Ai constituting each mixture is not particularly limited. In some cases, for example, mixing the test samples Ai in equal amounts or equimolar amounts produces a homogeneous rate of interaction with test samples in group B, facilitating the interpretation of detection results. If there is a difference in the labeling efficiency among the respective test samples Ai or in the rate of interaction between groups A and B owing to the combination, it is possible to equalize the rate of interaction by modulating the ratio of the test samples Ai. For example, the present invention can use a mixture of probes comprising DNA radioisotope-labeled by two different methods, the kination method and the random oligomer elongation method. In such a case, since the intensity of signals obtained from each probe is expected to be different, an equalized signal intensity cannot be achieved by mixing equal amounts. Therefore, we can attempt to equalize the signal intensity by first measuring the labeling efficiency and signal intensity of each probe and, based on the result, adjusting the mixing ratio of probes.
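
One simple way to adjust the mixing ratio is to weight each probe inversely to its measured signal per unit amount. The sketch below is an assumption made for illustration, not a procedure given in the specification: it presumes that signal scales linearly with the amount of probe, and the probe names and intensity values are hypothetical.

```python
def mixing_fractions(signal_per_unit: dict[str, float]) -> dict[str, float]:
    """Fractions of each probe such that (fraction x signal per unit) is equal for all probes.

    Assumes signal is proportional to the amount of probe, which may not hold in practice.
    """
    if any(value <= 0 for value in signal_per_unit.values()):
        raise ValueError("signal intensities must be positive")
    weights = {probe: 1.0 / value for probe, value in signal_per_unit.items()}
    total = sum(weights.values())
    return {probe: weight / total for probe, weight in weights.items()}


# Hypothetical measurements: the second probe gives four times the signal per unit amount.
fractions = mixing_fractions({"kination_probe": 1.0, "random_oligomer_probe": 4.0})
expected_signals = {probe: fractions[probe] * signal
                    for probe, signal in {"kination_probe": 1.0,
                                          "random_oligomer_probe": 4.0}.items()}
```

Under the linearity assumption, the weaker probe makes up four fifths of the mixture and both probes contribute the same expected signal.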

[0024] In the present invention, drugs and solvents which do not adversely affect screening may be added to mixture Ck, in addition to test samples Ai in group A. If a test sample in group A may adversely affect the screening by interaction, an agent may be added to reduce the harmful effect.

[0025] Mixture Ca comprising all test samples in group A may be used to enhance the fidelity of the present invention. If no interaction is detected in the reaction between Ca and test samples in group B, either there is no combination which causes an interaction between them or there is a problem in detecting the interaction.
[0026] The interactions of the n varieties of mixtures from C1 through Cn with each of test samples Bj in group B can be detected by any method for identifying their interaction. For example, interactions between oligonucleotides and DNAs can be detected using hybridization between them as the marker.

[0027] In order to describe the present invention more understandably, a hypothetical screening of 24 clones as group B performed with six DNA markers as group A is diagrammatically represented here. These six DNA marker probes were specified by assigning a three-bit ID number to each probe, for example, "001" to DNA probe 1 and "010" to DNA probe 2 (Table 4).



Table 4

              M3   M2   M1   ID number   MA
DNA probe 1   0    0    1    001         1
DNA probe 2   0    1    0    010         1
DNA probe 3   0    1    1    011         1
DNA probe 4   1    0    0    100         1
DNA probe 5   1    0    1    101         1
DNA probe 6   1    1    0    110         1





[0028] These DNA probes were then mixed in the combinations shown in Table 4 to prepare four probe mixtures, M1, M2, M3, and MA. In the table, "1" indicates the presence of the marker and "0" indicates its absence. Four probe mixtures (MA, M1, M2, and M3) were provided, and hybridization was carried out separately with each of these probe mixtures against four identical clone filters (Figure 2). The probe mixture MA, containing all six probes, detected four clones (A, B, C, and D), whereas the other probe mixtures detected these same clones in different combinations. The hybridization pattern was then examined for individual clones. For example, clone A was positive for probe mixtures M1 and M2 but negative for M3, resulting in the three-bit pattern "011". This matrix pattern indicated that clone A is correlated with a specific DNA probe, probe 3 (Table 5). Similarly, the remaining three clones were correlated with specific DNA probes.



Table 5

Clone     M3   M2   M1   Determined DNA probe
Clone A   0    1    1    DNA probe 3
Clone B   1    1    0    DNA probe 6
Clone C   1    0    1    DNA probe 5
Clone D   0    0    1    DNA probe 1
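
The decoding of Table 5 can be reproduced mechanically from the ID numbers of Table 4. The following sketch reads each clone's (M3, M2, M1) signals as a three-bit number and looks up the probe with that ID; the variable and function names are illustrative only.

```python
# ID numbers of the six DNA probes, as assigned in Table 4.
PROBE_IDS = {1: 0b001, 2: 0b010, 3: 0b011, 4: 0b100, 5: 0b101, 6: 0b110}

# Hybridization signals (M3, M2, M1) observed for each clone, as listed in Table 5.
PATTERNS = {"A": (0, 1, 1), "B": (1, 1, 0), "C": (1, 0, 1), "D": (0, 0, 1)}


def probe_for(pattern):
    """Convert an (M3, M2, M1) signal pattern to a probe number, or None if no ID matches."""
    m3, m2, m1 = pattern
    id_number = (m3 << 2) | (m2 << 1) | m1
    by_id = {value: probe for probe, value in PROBE_IDS.items()}
    return by_id.get(id_number)


assignments = {clone: probe_for(pattern) for clone, pattern in PATTERNS.items()}
```

The result assigns clones A, B, C, and D to DNA probes 3, 6, 5, and 1, respectively, in agreement with Table 5.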



[0029] In DNA library screening using DNA probes, screening fidelity can be enhanced as follows.

(1) Use double-offset filters or two replica filters for a single series of probe mixtures to prevent erroneous results caused by false-positive and false-negative signals. These filters should all yield the same detection results as the original filter. Accordingly, they must generate identical signal patterns for the same series of probe mixtures. Any difference in the signal detection pattern indicates that one of the results is erroneous.

(2) Adding a parity bit to each ID number will standardize the number of "1"s based on binary notation to an even number (Table 6). That is, test samples of group A having an odd number of "1"s in the ID number are collected to provide mixture Co, and signals from test samples in group B for this mixture are simply recorded. When an interaction between a test sample in group B and this mixture Co is detected, a parity bit of "1" is added to the ID number. When the test sample in group A which interacts with a test sample in group B originally has an even number of "1"s in its ID number, the interaction is not detected with said mixture Co, and therefore the parity bit is always 0. As a result, the number of "1"s in the ID number + parity bit is always an even number for all test samples in group B. One bit for the parity bit is added to the seven-bit figures determined as the ID number through the above-described procedures, and the resulting eight-bit figures are referred to the test samples in group A.
Table 6

In this table, the sum of "1"s in the ID bits + parity bit is made an even number.

          ID bits   Parity bit   Number of "1"s
STS1      0000001   1            2
STS2      0000010   1            2
STS3      0000011   0            2
STS4      0000100   1            2
STS5      0000101   0            2
STS6      0000110   0            2
STS7      0000111   1            4
...       ...       ...          ...
STS80     1011010   0            4
STS81     1011011   1            6
...       ...       ...          ...
STS126    1111110   0            6



The "1"s in the finally decided eight-bit figures are counted. If a signal is missing or excessive for any bit, the number of "1"s will become an odd number, indicating trouble. A hypothetical case in which the number of "1"s constituting the ID bits and parity bit is standardized to an even number has been described. Obviously, the number of "1"s can also be standardized to an odd number. When standardized to an odd number, an even number of "1"s indicates trouble. Thus, the fidelity of the screening method in the present invention can be enhanced by using only one additional series of filters.
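
The parity check can be illustrated with a few lines of code. This is a sketch of the even-parity case only; the function names are hypothetical, and the ID values are taken from Table 6. Note that a single parity bit flags an odd number of wrong bits, so two simultaneous errors would pass unnoticed.

```python
def parity_bit(id_number: int) -> int:
    """1 if the ID has an odd number of "1"s, so that ID bits + parity bit total an even count."""
    return bin(id_number).count("1") % 2


def is_consistent(observed_id: int, observed_parity: int) -> bool:
    """True when the observed ID bits plus the observed parity bit contain an even number of "1"s."""
    return (bin(observed_id).count("1") + observed_parity) % 2 == 0


# Parity bits for some of the IDs listed in Table 6.
table6 = {name: parity_bit(id_number) for name, id_number in {
    "STS1": 0b0000001, "STS3": 0b0000011, "STS7": 0b0000111,
    "STS80": 0b1011010, "STS81": 0b1011011, "STS126": 0b1111110}.items()}

clean_read = is_consistent(0b1011011, 1)    # STS81 read correctly
dropped_bit = is_consistent(0b1011010, 1)   # same clone with the first-bit signal missing
```

Here `clean_read` is true and `dropped_bit` is false, which is the "trouble" indication described above.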

(3) Performing another screening with a reverse bit assigned to each ID of an STS will also remarkably enhance the fidelity of the screening method of the present invention (Table 7).



Table 7

          Screening 1   Screening 2
STS1      0000001       1111110
STS2      0000010       1111101
STS3      0000011       1111100
STS4      0000100       1111011
STS5      0000101       1111010
STS6      0000110       1111001
STS7      0000111       1111000
...       ...           ...
STS80     1011010       0100101
STS81     1011011       0100100
...       ...           ...
STS126    1111110       0000001



If the screening result is precise, ID (screening 1) plus ID (screening 2) is always 1111111. If any signals are missing, ID (screening 1) plus ID (screening 2) is always less than 1111111. If more than one STS hybridizes with a single clone, ID (screening 1) plus ID (screening 2) is always greater than 1111111. This strategy requires two screenings.
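
The three cases above can be sketched as a comparison of the sum of the two observed IDs against the all-ones value. The function names are hypothetical, and the sketch assumes that missing signals and multiple hybridization do not occur together in the same clone, since their effects on the sum could then offset each other.

```python
def reverse_id(id_number: int, n_bits: int = 7) -> int:
    """ID for screening 2: every bit of the screening 1 ID inverted."""
    return id_number ^ ((1 << n_bits) - 1)


def classify(read1: int, read2: int, n_bits: int = 7) -> str:
    """Compare ID (screening 1) plus ID (screening 2) with the all-ones value."""
    full = (1 << n_bits) - 1
    total = read1 + read2
    if total == full:
        return "consistent"
    return "signal missing" if total < full else "more than one probe"


sts80 = 0b1011010
precise = classify(sts80, reverse_id(sts80))
missing = classify(sts80, reverse_id(sts80) & ~1)   # one screening 2 signal lost

# STS1 and STS2 hybridizing with the same clone: each screening reads the OR of both IDs.
sts1, sts2 = 0b0000001, 0b0000010
double = classify(sts1 | sts2, reverse_id(sts1) | reverse_id(sts2))
```

The three calls return "consistent", "signal missing", and "more than one probe", respectively.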

Therefore, the maximum advantage of performing screenings 1 and 2 is that a combination of probe and clone in an accurate correlation can be detected. In contrast, with the result obtained by either screening 1 or screening 2 alone, it is impossible to discriminate correct cases, incorrect cases, and cases in which more than one probe hybridizes with a single clone, resulting in mixed data in these cases. If most results of either screening are correct, two screenings are not required. However, in general, two screenings provide a significant capability to distinguish the correct result from the incorrect ones.

Performing two screenings enables not only enhancing the f idelity of screening but also separating multiple sig- 
nals caused by the many: 1 correspondence. The separation of signals will be specifically described in the follow- 
ing. For exanple. a single STS probe is assigned binary ID numbers in two different series. FORWARD ID and 
REVERSE ID FORWARD ID and REVERSE ID are assigned binary numbers in an independent series, and mix- 
tures in different combinations are prepared. The congelation with clones constituting libraries will be determined 
separately based on the present invention. In such an embodiment with FORWARD ID and REVERSE ID. the fol- 
lowing natural number relation is independently established. 

2"-^o<m=<2"-1, 

where m and n are natural numbers, m>=:3. and n>=2. 

FORWARD ID and REVERSE ID for a single STS probe are assigned so that their sum always has "1" in all
bits. When one probe corresponds to a certain clone, the sum of FORWARD ID and REVERSE ID determined for
that clone must thus have "1" in all bits. For example, in eight-bit IDs, the sum will be "11111111." However,
when two STS probes hybridize with one particular clone, the sum of FORWARD ID and REVERSE ID derived from
the hybridization contains "2" for one or more bits, such as "11221212" for an eight-bit number. The addition here
does not follow binary notation but, for convenience's sake, follows the notation 1+1=2, 0+1=1, 1+0=1, and 0+0=0
for each bit because this is more easily understood than binary notation. However, this invention is not limited to
such an expression pattern. The expression pattern need only clearly indicate whether both FORWARD ID and
REVERSE ID, only one of them, or neither of them is obtained for each bit. In such a case, it is possible to deduce
which two of the STS probes hybridize with a clone according to the following concept. The presence of "2" in only
one place, that is, the expression of "2" for only one bit as in "11121111," indicates the interaction with one
combination of STS probes (two probes), enabling the straightforward identification of both STS probes.
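The bit-by-bit addition described above can be sketched as follows. This is an illustration only: it assumes that the REVERSE ID is the bitwise complement of the FORWARD ID, and that a mixture gives a positive signal on a clone whenever at least one hybridizing probe is present in that mixture. The function names are hypothetical.

```python
def reverse_id(forward: str) -> str:
    """Complement of a FORWARD ID, so that FORWARD + REVERSE is '1' in every bit."""
    return "".join("1" if bit == "0" else "0" for bit in forward)


def combined_pattern(forward_ids: list[str]) -> str:
    """Per-bit sum (1+1=2, 0+1=1, 1+0=1, 0+0=0) of the observed FORWARD and REVERSE results.

    forward_ids holds the FORWARD IDs of all probes hybridizing with one clone.
    """
    n = len(forward_ids[0])
    digits = []
    for i in range(n):
        # FORWARD mixture i is positive if any hybridizing probe has bit i set.
        fwd = any(f[i] == "1" for f in forward_ids)
        # REVERSE mixture i is positive if any hybridizing probe has bit i clear.
        rev = any(reverse_id(f)[i] == "1" for f in forward_ids)
        digits.append(str(int(fwd) + int(rev)))
    return "".join(digits)


if __name__ == "__main__":
    print(combined_pattern(["10111000"]))              # one probe
    print(combined_pattern(["10010000", "10000000"]))  # two probes differing in one bit
```

With a single probe every digit is "1"; with two probes a "2" appears at exactly the bits in which their FORWARD IDs differ.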

Theoretically, as the number of bits containing "2" increases to 2, 3, 4, 5, 6, 7, or 8, the number of STS probes
corresponding to the target increases to 4, 8, 16, 32, 64, 128, or 254, and there are thus 2, 4, 8, 16, 32, 64, or 127
different combinations of probe pairs corresponding to the target. That is, when "2" is expressed in 2, 3, 4, and 5
bits, the possible correspondence of STS probes can be narrowed to 4, 8, 16, and 32 varieties (half as many
combinations). If one or two probes correspond to one particular clone, their correlation can be narrowed in this way.

When screenings 1 and 2 are performed, the signals generated by hybridization of two probes with one particular
clone can often be separated into the component signals even if the result is expressed by a single kind of signal,
such as an autoradiogram based on radioisotope labeling, instead of the below-described multicolor probe.
First, each probe can be identified if there is one "2" as described above, even when the difference in signal
intensities generated by each probe is small. In addition, each probe can be identified when there is a definite
difference between the signals, even when there are two or more "2"s. For example, the sum of FORWARD ID
(11111100) and REVERSE ID (11111111) becomes "22222211," indicating the hybridization of two or more probes.
Here, we assume that these are strong and weak signals and can be distinguished. If we represent the strong and
weak signals for the bits having 2 as the sum by S1 and W1, the above results may be expressed as FORWARD ID
(S1, W1, S1, S1, S1, W1, 0, 0) and REVERSE ID (W1, S1, W1, W1, W1, S1, 1, 1). From these results, the FORWARD
ID of the probe expressing the strong signal is 10111000, and the FORWARD ID of the probe expressing the weak
signal is 01000100. Although three or more STS probes may be hybridized, the unnecessary secondary or tertiary
screenings may be omitted when the correlation can be confirmed using the probe having the ID number thus
isolated. Furthermore, the method of using two eight-bit ID numbers is equivalent to assigning one 16-bit ID number.
Thus, the fidelity of the method of the present invention can be enhanced by adding bits to the minimally
required bits. "Minimally required bits" means the number of bits (n) needed for assigning individual ID numbers to
test samples in group A. For this purpose, additional bits are provided by adding a desired number of bits to the
minimally required bits to produce a g-bit ID number.
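The separation of a strong and a weak signal in the example above can be sketched as follows. The token encoding ("S" for a strong signal, "W" for a weak signal, "1" for a signal at a bit shared by both probes, "0" for no signal) and the function name are assumptions made for illustration only.

```python
def split_by_intensity(forward: list[str], reverse: list[str]) -> tuple[str, str]:
    """Recover the FORWARD IDs of the strong and the weak probe from graded signals."""

    def ids_for(tokens: list[str], own: str) -> str:
        # A probe has a "1" wherever its own intensity, or a shared signal, is seen.
        return "".join("1" if t in (own, "1") else "0" for t in tokens)

    strong_f, weak_f = ids_for(forward, "S"), ids_for(forward, "W")
    strong_r, weak_r = ids_for(reverse, "S"), ids_for(reverse, "W")
    # Consistency check: FORWARD and REVERSE IDs of one probe must be complementary.
    for f, r in ((strong_f, strong_r), (weak_f, weak_r)):
        if any(a == b for a, b in zip(f, r)):
            raise ValueError("FORWARD and REVERSE results are not complementary")
    return strong_f, weak_f


if __name__ == "__main__":
    fwd = ["S", "W", "S", "S", "S", "W", "0", "0"]
    rev = ["W", "S", "W", "W", "W", "S", "1", "1"]
    print(split_by_intensity(fwd, rev))
```

The complementarity check is what the second (REVERSE) series contributes: an inconsistent pair of readings is rejected instead of being decoded into a wrong ID.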

In many cases in which two probes are assumed to correspond to one particular clone, the number of
combinations of two STS probes assumed to interact can be narrowed by using additional bits. For example, if
n bits are sufficient, the number of probe combinations can be reduced to 1/4 to 1/16 of that obtained when the
screening is carried out with probes having an n-bit ID number, provided that screenings 1 and 2 are performed
after assigning an (n+2)-bit ID number to the STS probes by using an additional two bits. The more bits added,
the greater the reduction becomes. However, the necessary number of bits should be determined considering the
frequency of correspondence of two probes to one clone, since additional probe mixtures must be prepared in
proportion to the additional bits.

(4) Another screening with a second series of oligonucleotide probes (such as reverse primers) will decrease the 
number of false-positive signals. 

(5) Certain probes may hybridize with multiple clones if they contain multicopy sequences such as long and short
repetitive sequences. This would interfere with determining the proper ID number. These repetitive sequences, if
known, should be eliminated by performing a careful homology search in computer databases. In practice, a
preliminary experiment should generally be performed to find and eliminate undesired STS probes prior to actual
screening. When group A contains oligonucleotide probes having affinity to multicopy sequences such as long and
short repetitive sequences, the identical reaction pattern for the same clone may be observed among different
probe mixtures. Therefore, the oligonucleotides to be eliminated can be detected using this unique reaction pattern
as a marker. In this invention, probes causing such a non-specific reaction are designated as bad background
oligo-probes (BBO). The concepts of steps (1) through (5) can be used not only singly but also in proper
combinations, providing various choices for enhancing the fidelity of screening according to this invention.

[0030] The interacting combination of STS markers and genome library can be detected by reacting probe mixtures
(group A) comprising labeled STS markers with genome libraries (group B) fixed on filters or DNA chips. Methods for
fixing a genome library to filters are known. Although there are several tens of thousands to several hundreds of
thousands of clones in each genome library, an efficient assay system can be constructed by using a high-density
filter on which several thousand varieties of clones can be fixed. DNA chips are also useful for fixing DNA at a high
density. Each clone of the genome library is separately fixed in a grid on the DNA chip and reacted with the mixture
of fluorescence-labeled probes. The grid showing a positive hybridization reaction is then determined. By reacting
the probe mixture with a highly integrated library, the throughput is remarkably enhanced as compared with the
conventional method, which requires one-by-one reactions.

[0031] As described above, more correct results are obtained in the method for determining the correlation according
to the present invention when the number of test samples in group B correlated to those in group A is in a ratio of
1 : 1 or 1 : many. This is because, in principle, only two kinds of data, positive "1" or negative "0," can be obtained
when radioisotope labels are detected by autoradiography. When two or more probes detect one target, the correct
ID number cannot be derived owing to the overlapped signals. However, the correct correlation will be found, even
with a many : 1 correspondence, when a special labeling method is used as described below. This method will be
described with reference to an example for finding the correlation between STS probes and the genome library.
[0032] For example, STS probes are assigned many varieties of labeling to generate distinguishable signals. Such
labeling is exemplified by that with fluorescence pigments of different fluorescence wavelengths and pigments having
different colors. When all probes are labeled so as to give the same signal, it is difficult to distinguish the
correspondence if many probes correspond to a particular clone (that is, a many : 1 correspondence). However, when
many probes have distinguishable different signals, it is possible to clarify which co-present probe reacts, enabling
the separation of signals. For example, if two kinds of probes correspond to a single clone, a five-color labeling
theoretically enables the identification of probes in 80% of the cases. In the present invention, such multilabeled
probes enabling the determination of a more correct correlation are designated multicolor probes.
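The 80% figure can be checked with a short calculation. The sketch assumes that each probe carries one of c labels chosen independently and with equal probability, which is an assumption of this illustration and is not stated in the text; `p_distinguishable` is a hypothetical name.

```python
from fractions import Fraction


def p_distinguishable(colors: int) -> Fraction:
    """Probability that two probes on the same clone carry different labels."""
    if colors < 1:
        raise ValueError("at least one label is required")
    # The second probe matches the label of the first with probability 1/colors.
    return 1 - Fraction(1, colors)


if __name__ == "__main__":
    for c in (2, 3, 5):
        print(c, "labels:", p_distinguishable(c))
```

For five labels this gives 4/5, in agreement with the theoretical value quoted above.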

[0033] It is also possible to minimize the number of ID numbers by using a multicolor probe because, even though the
identical ID number is assigned to different test samples, they can be distinguished by the difference in labeling. Using
this characteristic, it is theoretically possible to reduce the number of varieties of mixtures (the number of reactions for
establishing the correspondence) in proportion to the number of varieties of labeling. For example, when the identical
ID number is assigned to five different probes using five varieties of labels, the number of mixtures will be reduced to
1/5.

[0034] By using the method for determining the interacting combination of the present invention, it is possible to
efficiently perform not only genome analysis but also the screening of transcriptional regulatory agents and agonist
compounds for membrane receptors. These applications are described specifically below.
[0035] The present invention can be applied to screening transcriptional regulators corresponding to target genes.
The present invention allows simultaneous screening of the activity of candidate compounds on the transcriptional
regulatory region corresponding to each of numerous target genes, not only to a single target. Transcriptional
evaluation plasmids are prepared by inserting the structure formed by replacing the coding region of each gene with
a reporter gene and linking it to the transcriptional regulatory region localized upstream of its 5'-side. The
transcriptional regulatory activity is then assayed by successively contacting the transformants with candidate
compound mixtures. By transforming each transformant with many varieties of plasmid, conditions equivalent to those
corresponding to the mixtures of test samples in group A can be constructed. By just screening each candidate
compound using a few transformants, the correlation of candidate compounds having transcriptional regulatory
activity with the transcriptional regulatory regions as their targets can be determined.

[0036] The present invention also enables the efficient screening of agonist compounds for membrane receptors with
unknown functions as a part of the functional analysis of genes. The transcriptional signal at the final stage of the
intracellular signal transduction is utilized for this screening. A considerable portion of the final transcriptional
signals generated by membrane receptors is thought to converge on the cAMP responsive element (CRE), the terminal
element of the cAMP signal, or on AP1, one of the terminal elements of the Ca signal. Cultured cells are
co-transformed by transferring a transcriptional evaluation plasmid in which a reporter gene is linked downstream of
CRE or AP1 and another plasmid expressing membrane receptors with unknown functions. The agonist activity of
candidate compounds can be determined by contacting candidate agonist compounds with this transformant to detect
the expression of the reporter activity. In this case, a process equivalent to that for preparing test sample mixtures
in group A of the present invention can be achieved by transferring many membrane receptor genes to the same cell.
The correlation between candidate compounds having agonist activity and their target membrane receptors can be
found by just screening each candidate compound using a few transformants. In this method, the correlation between
receptors and agonists can be detected even with membrane receptors having unknown functions, provided they
trigger the signal transduction mediated by CRE or AP1.

[0037] Epitope analysis of monoclonal antibodies can also be performed based on the present invention. Antigen
fragments are first prepared as test samples in group A. For example, in the case of proteinaceous antigens,
oligopeptide libraries comprising amino acid sequences shifted by several amino acids each from the terminus are
synthesized and assigned ID numbers. Epitope analysis can be performed by identifying the correlation of these
libraries with monoclonal antibodies as test samples in group B. For some macromolecular antigens, the
correspondence to several hundreds of oligopeptides might have to be examined. However, the application of the
present invention enables clarifying the correlation with only a few assays.

Brief Description of the Drawings

[0038]

Figure 1 depicts the correlation model of test samples in group A with those in group B.
Figure 2 depicts the hybridization pattern model when the screening method of the present invention was
performed using six DNA markers (group A) and 24 clones (group B). Four DNA probe mixtures (MA, M1, M2, and M3)
were provided, and each probe mixture was separately hybridized with four identical replica filters containing 24
clones each. The probe mixture MA, containing all six probes, detected four clones (A, B, C, and D), whereas the other
probe mixtures detected these same clones in different combinations.

Best Mode for Implementing the Invention

[0039] The present invention will be described more specifically with reference to examples, but is not to be construed
to be limited thereto.

Example 1

[0040] Screening with the present invention was performed using 15,360 BAC clones [Gene 191 (1997), p69-79] and
126 STS markers (Tables 8 to 10). Each of the nucleotide sequences of the STS markers and primers is available from
dbSTS (http://www.ncbi.nlm.nih.gov/dbSTS/) provided by NCBI and under DDBJ Accession No. C75685-C75936. In this
case, a seven-bit ID number was assigned to each of the 126 probes as shown in Tables 8 to 10. Eight mixtures of STS
marker probes (FA, F1 through F7) were provided. The mixture FA contained all 126 STS probes, and the other
mixtures contained 63 STS probes each in arranged combinations.
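A minimal sketch of how such mixtures can be composed is given below. It assumes that the ID number of each probe is simply the binary form of its serial number (1 to 126), which is consistent with the patterns listed in the tables, and that mixture Fk contains every probe whose k-th bit, counted from the least significant bit, is "1". The function name `build_mixtures` is hypothetical.

```python
def build_mixtures(num_probes: int, bits: int) -> dict[str, list[int]]:
    """Assign probe serial numbers to mixtures FA and F1..F<bits> by their binary ID."""
    if num_probes > 2**bits - 1:
        raise ValueError("not enough bits for the number of probes")
    probes = list(range(1, num_probes + 1))
    mixtures = {"FA": probes}  # FA contains every probe
    for k in range(1, bits + 1):
        # Fk holds the probes whose k-th bit (F1 = least significant) is set.
        mixtures[f"F{k}"] = [p for p in probes if (p >> (k - 1)) & 1]
    return mixtures


if __name__ == "__main__":
    mixtures = build_mixtures(126, 7)
    for name, members in mixtures.items():
        print(name, len(members))
```

Because the all-one number 127 is not used, every one of the seven mixtures F1 to F7 ends up with exactly 63 of the 126 probes.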

[0041] To a reaction mixture (50 µl) containing 0.1 pmol each of 63 (or 126) STS marker oligonucleotides (forward
amplimers) were added 31.5 µCi (or 63 µCi) of [γ-32P]ATP (5,000 Ci/mmol, AA0018, Amersham) and one (or two) units
of T4 polynucleotide kinase (Takara). The mixture was then incubated at 37°C for 30 min. The unincorporated
[γ-32P]ATP was removed with a Sephadex G-50 spun-column. Approximately 50% of the 32P was incorporated.
Prehybridization was performed at 55°C for 12 h with a solution of 5 x SSC, 5 x Denhardt's, and 0.5% SDS containing
0.1 mg/ml denatured salmon testes DNA (Sigma, D3159; average 200 bp) and 0.2 mg/ml denatured herring sperm DNA
(Sigma, D1626; average <100 bp), which were fragmented by sonication. After the prehybridization, the labeled probes
were added to the reaction mixture and incubated at 55°C for 72 h. The final concentration of each probe was
0.005 pmol/ml. The filters were washed three times with 2 x SSC and 0.5% SDS for 10 min at room temperature,
followed by two washings for 30 min at 60°C, and finally washed with 0.1 x SSC and 0.1% SDS for 10 min at room
temperature. The filters were exposed against Fuji imaging plates for 72 h. The autoradiograms were obtained with a
BAS 2000 Bio Imaging analyzer (Fuji Photo Film).

[0042] Each of these probe mixtures labeled with 32P by the above-described method was separately hybridized
against five sets of high-density replica (HDR) filters, on each of which 3072 BAC clones were blotted. Hybridization
signals were clearly observed in autoradiograms. The probe mixture FA containing all 126 probes detected 30 BAC
clones on this particular HDR filter (a total of 104 clones for five separate HDR filters), whereas the seven other
probe mixtures detected these same clones in different combinations. The hybridization signal patterns were examined
by the method disclosed in the present specification. For example, BAC 66 was positive for probe mixtures F7, F5, F4,
F2, and F1, but negative for F6 and F3, resulting in the seven-bit pattern "1011011" (Tables 11 to 12). As a result, the
correlation between BAC 66 and a specific STS probe, D11S1308 (Tables 8 to 10), was determined. In the same way, the
seven-bit numbers of the remaining clones were determined, and the correlation was established with 48 specific STS
probes. Among the 48 STS probes, two probes (STS7 and STS23) detected more clones than expected, a total of 23
BAC clones (eight and 15 clones, respectively). This is probably because these probes contain multicopy repetitive
sequences, although these oligonucleotide probes were designed to be as unique as possible to prevent false-positive
signals upon hybridization. In order to confirm the fidelity of the digital hybridization screening, PCR analysis was
performed, excluding these 23 BAC clones. The method of PCR analysis will be described below.
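Reading a clone's ID from its hybridization pattern can be sketched in a few lines. The snippet assumes that F7 is the most significant bit and F1 the least significant bit of the seven-bit number, and that the STS serial number is the decimal value of that pattern, as the DH result and STS columns of Tables 11 and 12 suggest. `decode_clone` is a hypothetical name.

```python
def decode_clone(positive_mixtures: set[str], bits: int = 7) -> tuple[str, int]:
    """Turn the set of mixtures that lit up a clone into (bit pattern, STS serial number)."""
    # Read the mixtures from the most significant (F<bits>) down to F1.
    pattern = "".join(
        "1" if f"F{k}" in positive_mixtures else "0" for k in range(bits, 0, -1)
    )
    return pattern, int(pattern, 2)


if __name__ == "__main__":
    # A clone positive for F7, F5, F4, F2 and F1 but negative for F6 and F3.
    print(decode_clone({"F7", "F5", "F4", "F2", "F1"}))
```

For the clone in the example the pattern is "1011011", which corresponds to serial number 91.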

[0043] PCR analysis was performed for 81 clones using forward and reverse amplimers designed for each of the 46 STS
marker probes selected by the above-described method. The results of PCR analysis together with the DH screening
data are shown in Tables 11 and 12. The PCR analysis indicated that 76 BAC clones were analyzable and five clones
were not. Among these 76 BAC clones, 59 clones were confirmed to correlate with 37 specific STS markers. These
results clearly indicate the high success rate (59/76 = 78%) for detecting the correlation between clones and STS
markers.
Table 8 (continued)

Table 9 (continued)

No.   STS         FA   F7   F6   F5   F4   F3   F2   F1
103   D11S1324    1    1    1    0    0    1    1    1
104   D11S1325    1    1    1    0    1    0    0    0
105   D11S1326    1    1    1    0    1    0    0    1
106   D11S1327    1    1    1    0    1    0    1    0
107   D11S1328    1    1    1    0    1    0    1    1
108   D11S1329    1    1    1    0    1    1    0    0
109   D11S1330    1    1    1    0    1    1    0    1
110   D11S1331    1    1    1    0    1    1    1    0
111   D11S1332    1    1    1    0    1    1    1    1
112   D11S1333    1    1    1    1    0    0    0    0
113   D11S1335    1    1    1    1    0    0    0    1

Table 10 (continued)

No.   STS         FA   F7   F6   F5   F4   F3   F2   F1
114   D11S1336    1    1    1    1    0    0    1    0
115   D11S1337    1    1    1    1    0    0    1    1
116   D11S1339    1    1    1    1    0    1    0    0
117   D11S1340    1    1    1    1    0    1    0    1
118   D11S1341    1    1    1    1    0    1    1    0
119   D11S1342    1    1    1    1    0    1    1    1
120   E-22        1    1    1    1    1    0    0    0
121   E-68        1    1    1    1    1    0    0    1
122   E-86        1    1    1    1    1    0    1    0
123   E-161       1    1    1    1    1    0    1    1
124   E-218       1    1    1    1    1    1    0    0
125   E-836       1    1    1    1    1    1    0    1
126   E-877       1    1    1    1    1    1    1    0


Table 11

clone   DH result   STS   PCR
1       0011000     24    P
2       1111010     122   P
3       0010001     17    P
4       1011011     91    P
5       1011101     93    P
6       0001010     10    P
7       0101010     42    P
8       0011110     30    P
9       1001110     78    P
10      1111010     122   N
11      1001110     78    N
12      1001111     79
13      0111010     58
14      1111010     122   P
15      0110001     49    P
16      1111001     121   P
17      1100101     101   P
18      0111000     56    P
19      0000011     3     P
20      1101110     110   P

Table 11 (continued)

clone   DH result   STS   PCR
21      1011010     90    P
22      0111111     63    P
23      1101101     109   P
24      1011111     95    P
25      1101111     111   P
26      1000011     67    P
27      1101111     111   N
28      0111111     63    N
29      0000011     3     N
30      1000010     66    N
31      0000101     5     -
32      0000101     5     -
33      0011111     31    -
34      1111110     126   P
35      1000111     71    P
36      0010110     22    P
37      1000011     67    P
38      1111100     124   P
39      1011011     91    P
40      0111011     59    P
41      0000100     4     P




Table 12

clone   DH result   STS   PCR
42      1000100     68    P
43      1010101     85    P
44      1101111     111   N
45      1111001     121   N
46      1010111     87    N
47      0011011     27    P
48      0111000     56    P
49      1110000     112   P
50      1110000     112   P
51      0101010     42    P
52      1101111     111   P
53      0110001     49    P
54      0110110     54    N






EP0 976 836 A1 

Table 12 (continued)

clone   DH result      STS            PCR
55      1011101        93             N
56      1111111
57      0010000        16             P
58      0011011        27             P
59      1000111        71             P
60      0000110        6              P
61      1011101        93             P
62      1111110        126            P
63      0011110        30             P
64      1100100        100            P
65      1011111        95             P
66      1011011        91             P
67      1011001        89             P
68      1101011        107            P
69      [illegible]    [illegible]    P
70      0010110        22             P
71      0100011        35             P
72      0000011        3              P
73      1000100        68             P
74      1101101        109            P
75      1000111        71             P
76      1101111        111            P
77      1111010        122            N
78      1010111        87             N
79      0101101        45             N
80      1010111        87             N
81      1101010        106            N

P: PCR positive; N: PCR negative; -: not informative



Example 2

[0044] Oligonucleotide probes corresponding to human genomic repetitive sequences were detected by the five-bit
digital hybridization method. Oligonucleotide probes having affinity for repetitive sequences tend to detect extremely
large numbers of clones besides the desired ones, complicating the analysis and interfering with the correct
interpretation of results owing to the many : 1 correlation between groups A and B. Therefore, the following example
describes how such undesirable oligonucleotide probes can be selected and eliminated, as a more preferable
embodiment of the present invention.

[0045] The comprehensive isolation of BAC clones on chromosome 8 was attempted using the digital hybridization
screening method with 1014 expressed sequence tags (ESTs) mapped on chromosome 8 as probes. Either partner of
a pair of oligonucleotide primers (forward and reverse) for PCR amplification of each EST marker was used as a probe.
Nucleotide sequences of primers were selected from those registered in databases such as GDB. These nucleotide
sequences should ideally be designed based on a single-copy sequence. In practice, however, they may often contain
oligonucleotides derived from human genomic repetitive sequences. This results in undesirable hybridization with
BAC clones not corresponding to the original ESTs. In particular, when the repetitive sequence from which the
oligonucleotide is derived occurs at a high frequency in the genome, far more clones than desired will be selected as
positive clones. Such a situation may greatly impede the performance of digital hybridization screening on a large
scale. Therefore, digital hybridization screening was performed on a small scale before comprehensive screening was
performed on a large scale.

[0046] First, 33 probe mixtures were prepared by dividing the 1014 ESTs into portions of 31 ESTs each. Each probe
mixture was labeled with 32P in a manner similar to that described in Example 1. Colony hybridization was then
performed using the above-described labeled probe mixtures against filters blotted with 6114 BAC clones (equivalent
to approximately 0.2 genome). Screening the number of clones equivalent to 0.2 genome with 31 probes is theoretically
expected to generate about six positive signals on average. However, screening with probe mixtures containing
oligonucleotide probes derived from repetitive sequences detected several tens to several hundreds of positive clones.
Out of the 33 sets of probe mixtures, five were found to contain oligonucleotide probes presumably derived from
repetitive sequences. Therefore, the 31 oligonucleotide probes comprising each of these sets were specified by
assigning a five-bit ID number to each probe, and the digital hybridization was performed. Partial results are shown in
Table 13.
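The expectation of about six positive signals follows from multiplying the number of probes by the genome coverage of the filter. The sketch below reproduces that figure and flags mixtures whose positive count is far above it. The cut-off factor of 5 and the example counts are hypothetical values chosen for illustration, not data from the experiment.

```python
def expected_positives(num_probes: int, genome_equivalents: float) -> float:
    """Average number of positive clones if every probe is single-copy."""
    return num_probes * genome_equivalents


def flag_mixtures(counts: dict[str, int], num_probes: int,
                  genome_equivalents: float, factor: float = 5.0) -> list[str]:
    """Return the mixtures whose positive count exceeds factor x the expectation."""
    limit = factor * expected_positives(num_probes, genome_equivalents)
    return [name for name, count in counts.items() if count > limit]


if __name__ == "__main__":
    print(expected_positives(31, 0.2))  # roughly six positives expected
    # Hypothetical positive counts for three probe mixtures.
    counts = {"set 1": 7, "set 19": 240, "set 20": 4}
    print(flag_mixtures(counts, 31, 0.2))
```

A mixture flagged in this way is then a candidate for the five-bit digital hybridization that pinpoints the offending probe.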



Table 13

No.   Original No.   Name            M1   M2   M3   M4   M5
1     447            [illegible]     0    0    0    0    1
2     450            stSG2865        0    0    0    1    0
3     [illegible]    [illegible]     0    0    0    1    1
4     492            StSG48a7        0    0    1    0    0
5     505            stSG8247        0    0    1    0    1
6     521            [illegible]     0    0    1    1    0
7     [illegible]    [illegible]     0    0    1    1    1
8     584            TIGR-A004Q12    0    1    0    0    0
9     661            TIGR-A008B17    0    1    0    0    1
10    664            TIGR-A008N08    0    1    0    1    0
11    673            TIGR-A008X11    0    1    0    1    1
12    675            TIGR-A008Z29    0    1    1    0    0
13    676            TIGR-A008Z39    0    1    1    0    1
14    705            WI-11650        0    1    1    1    0
15    711            WI-11745        0    1    1    1    1
16    715            WI-11819        1    0    0    0    0
17    717            WI-11882        1    0    0    0    1
18    721            WI-12131        1    0    0    1    0
19    725            WM243           1    0    0    1    1
20    727            WI-12556        1    0    1    0    0
21    740            WM310           1    0    1    0    1
22    759            WI-14143        1    0    1    1    0
23    762            WI-14186        1    0    1    1    1
24    778            WI-15041        1    1    0    0    0
25    786            WI-15306        1    1    0    0    1
26    797            WI-15914        1    1    0    1    0

Table 13 (continued)

No.   Original No.   Name            M1   M2   M3   M4   M5
27    801            WI-16152        1    1    0    1    1
28    802            WI-16182        1    1    1    0    0
29    810            WI-16695        1    1    1    0    1
30    816            WI-16842        1    1    1    1    0
31    840            WI-17906        1    1    1    1    1



[0047] Table 13 shows the five-bit ID numbers assigned to the 31 probes in the 19th set, which contained
oligonucleotide probes presumably derived from repetitive sequences. Probe mixtures M1 to M5 were prepared based
on Table 13. Using each of these probe mixtures, colony hybridization was performed against identical filters blotted
with each clone. Numerous clones were detected in the same pattern with M2 and M5. Therefore, probe 661
(TIGR-A008B17), assigned ID number "01001," was determined to be derived from repetitive sequences. The reaction
pattern was then examined using probe mixtures from which only probe 661 (TIGR-A008B17) was eliminated.
Elimination of the oligonucleotide probe that was judged to be derived from repetitive sequences and to hybridize with
many clones resulted in clearer patterns.
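The identification of a repetitive probe from a recurring reaction pattern can be sketched as below. The clone names and hit sets are hypothetical data invented for illustration; the only facts taken from the text are that M1 is the leftmost bit of the five-bit ID and that the pattern shared by numerous clones points to the offending probe. `most_common_pattern` is a hypothetical helper.

```python
from collections import Counter


def most_common_pattern(hits: dict[str, set[str]],
                        mixtures: tuple[str, ...] = ("M1", "M2", "M3", "M4", "M5")
                        ) -> tuple[str, int, int]:
    """Return (pattern, serial number, clone count) for the pattern shared by most clones.

    hits maps each mixture name to the set of clones it detected.
    """
    clones = set().union(*hits.values())
    patterns = Counter(
        "".join("1" if clone in hits.get(m, set()) else "0" for m in mixtures)
        for clone in clones
    )
    pattern, count = patterns.most_common(1)[0]
    return pattern, int(pattern, 2), count


if __name__ == "__main__":
    # Hypothetical data: 40 clones light up in both M2 and M5, a few others elsewhere.
    repeats = {f"clone{i}" for i in range(40)}
    hits = {
        "M1": {"x"},
        "M2": repeats | {"x"},
        "M3": {"y", "z"},
        "M4": {"z"},
        "M5": repeats | {"y"},
    }
    print(most_common_pattern(hits))
```

In this illustration the dominant pattern is "01001", that is, serial number 9, which Table 13 lists as probe 661.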

[0048] In a similar manner, oligonucleotide probes contained in other probe mixtures and presumably derived from
repetitive sequences were also detected. Finally, out of the 1014 ESTs, five oligonucleotide probes derived from
repetitive sequences were detected and eliminated. The remaining 1009 ESTs were used for digital hybridization
screening on a large scale.

[0049] Furthermore, 377 STS probes, from which oligonucleotide probes derived from repetitive sequences had been
eliminated by a five-bit digital hybridization method similar to that used for the ESTs, were also used to screen BAC
clones (100,000 clones) on chromosome 7. A total of 1,386 STS probes and EST probes was divided into groups of 495,
490, and 401 to form three sets of probe mixtures.

[0050] With large-scale screening using combinations of EST probes and STS probes, about 4,300 out of 100,000
BAC clones could be rapidly screened as positive.

[0051] Next, these 4,300 clones were subcloned to generate sublibraries, and a more detailed correlation between
each probe and the BAC libraries was made by performing an eight-bit digital hybridization screening. Some 1,187
varieties of STS probes or EST probes were used for the digital hybridization.

[0052] Oligonucleotides designed based on the 1,187 different STS or EST sequences were divided as probes into
five mixtures (each comprising 236 to 238 probes). With these probe mixtures, eight-bit digital hybridization screening
was performed against a sublibrary comprising about 4,300 BAC clones derived from chromosome 8 and previously
screened. Each probe was assigned a FORWARD ID and a REVERSE ID, that is, specified by assigning two series of
ID numbers such that the sum of FORWARD ID and REVERSE ID becomes (11111111). Therefore, each screening
was performed using eight probe mixtures for FORWARD, eight probe mixtures for REVERSE, and one mixture
containing all probes, a total of 17 probe mixtures. If the sum of the ID numbers derived from the results obtained by
both the FORWARD and REVERSE mixtures is (11111111), one particular kind of probe hybridized with the BAC clone
and the correct result was obtained for the individual clone. After five screenings, one or more BAC clones could be
correlated to a total of 670 varieties of STS (or EST). It was presumed that about 70 varieties of STSs (or ESTs) lie
extremely close to other STSs or ESTs, so that two or more kinds of STS probes (or EST probes) hybridize with one
particular BAC clone. As to the remaining 450 kinds of STS (or EST), it was assumed that no corresponding BAC clone
was present in the sublibrary.

[0053] Thirty man-days were required to screen with 670 varieties of STS (or EST) probes. Such a speedy
identification of BAC clones corresponding to 670 varieties of STS probes is impossible with the conventional method,
in which correlations are determined one after another for all combinations of probes and clones.

Example 3 

[0054] Application of the present invention to screening transcriptional regulatory agents will be more specifically 
described with reference to screening agonist compounds against transcription factors belonging to the steroid receptor 
superfamily. 

[0055] The following 10 varieties of transcription factors belonging to the steroid receptor superfamily are chosen:

glucocorticoid receptor α,
progesterone receptor α,
androgen receptor α,
estrogen receptor α,
retinoic acid receptor α1,
retinoid X receptor α,
thyroid hormone receptor α1,
vitamin D3 receptor,
peroxisome proliferator-activated receptor γ1, and
hepatocyte nuclear factor 4-α1.

[0056] The correlation between transcription factors corresponding to these receptors and agonist compounds will be
determined using the present invention. CV-1 cells are used as the cultured cells to be transformed. CV-1 cells
cultured in a DMEM medium supplemented with 10% fetal bovine serum are washed twice with PBS(-), trypsin/EDTA is
added to them, and the cells are incubated at 37°C for 5 min. The cells are then suspended in the added medium and
centrifuged. The sedimented cells are resuspended in the added medium so that the cell density becomes 2x10*
cells/2.5 ml. The cells are inoculated 2.5 ml/well onto six-well plates and cultured in a CO2 incubator (37°C, 5% CO2)
for 18 to 24 h.
[0057] CV-1 cells thus seeded are transformed by transcriptional evaluation plasmids. Transcriptional evaluation
plasmids have a structure wherein the luciferase gene is linked downstream of each transcription factor. In this
example, plasmids harboring the above-described 10 varieties of transcription factors are mixed as shown in Table 14
and used for transformation. In this table, the line for each plasmid has a four-bit ID number, F4 to F1, and the F4 to
F1 columns indicate the presence ("1") or absence ("0") of the plasmid in each mixture. Mixture FA contains all
plasmids. When the use of this mixture fails to detect signals indicating the correlation between the plasmids and the
candidate compounds as transcriptional regulatory factors, the transcriptional regulatory activity may not exist for
any of the compounds tested, or the operational procedures may have gone awry.



Table 14

No.  Transcription factor                           FA  F4  F3  F2  F1
1    Glucocorticoid receptor α                      1   0   0   0   1
2    Progesterone receptor α                        1   0   0   1   0
3    Androgen receptor α                            1   0   0   1   1
4    Estrogen receptor α                            1   0   1   0   0
5    Retinoic acid receptor α1                      1   0   1   0   1
6    Retinoid X receptor α                          1   0   1   1   0
7    Thyroid hormone receptor α1                    1   0   1   1   1
8    Vitamin D3 receptor                            1   1   0   0   0
9    Peroxisome proliferator activated receptor γ1  1   1   0   1   0
10   Hepatocyte nuclear factor 4-α1                 1   1   0   1   1
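The pooling scheme of Table 14 can be sketched in a few lines of Python. This is only an illustration, not part of the disclosed procedure: the names TABLE_14, build_mixtures and decode are hypothetical, receptor names are spelled in ASCII, and the bit patterns are copied from the table as printed. The signal pattern passed to decode is an invented example, not a measured result.

```python
# Table 14 as data: plasmid number -> (name, 4-bit ID written in the order F4 F3 F2 F1).
TABLE_14 = {
    1: ("Glucocorticoid receptor alpha", "0001"),
    2: ("Progesterone receptor alpha", "0010"),
    3: ("Androgen receptor alpha", "0011"),
    4: ("Estrogen receptor alpha", "0100"),
    5: ("Retinoic acid receptor alpha1", "0101"),
    6: ("Retinoid X receptor alpha", "0110"),
    7: ("Thyroid hormone receptor alpha1", "0111"),
    8: ("Vitamin D3 receptor", "1000"),
    9: ("Peroxisome proliferator activated receptor gamma1", "1010"),
    10: ("Hepatocyte nuclear factor 4-alpha1", "1011"),
}
BIT_COLUMNS = ["F4", "F3", "F2", "F1"]


def build_mixtures(table):
    """Return {mixture name: sorted plasmid numbers}; FA holds every plasmid."""
    mixtures = {"FA": sorted(table)}
    for pos, name in enumerate(BIT_COLUMNS):
        mixtures[name] = sorted(n for n, (_, bits) in table.items() if bits[pos] == "1")
    return mixtures


def decode(signals, table):
    """Map per-mixture signals (F4..F1 -> bool) back to the matching plasmid numbers."""
    pattern = "".join("1" if signals[name] else "0" for name in BIT_COLUMNS)
    return [n for n, (_, bits) in table.items() if bits == pattern]


mixtures = build_mixtures(TABLE_14)
print(mixtures["F4"])  # [8, 9, 10]
print(decode({"F4": False, "F3": True, "F2": True, "F1": False}, TABLE_14))  # [6]
```

A signal pattern that matches no row, such as 1001, decodes to an empty list in this sketch, which would flag the result for re-examination rather than assign it to a plasmid.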



[0058] Each plasmid is used in the amount of 0.3 µg diluted in 150 µl of Opti-MEM. LipofectAMINE reagent (7.0 µl)
is added to a solution containing this evaluation plasmid, and the mixture is incubated at room temperature for 15 to 45
min. After incubation, the plasmid solution is further washed twice with PBS(-) (1 ml), and then added to CV-1 cells.
After the cells are incubated in a CO2 incubator for 4 h, the transformation solution is removed, trypsin/EDTA (200 µl)
is added, and the cell suspension is allowed to stand at 37°C for 5 min. After DMEM medium (containing no phenol red)
supplemented with 10% active charcoal-treated fetal calf serum (1 ml) is added, cells are transferred into a centrifuge
tube. Cells are recovered by centrifugation at 1,000 rpm for 3 min, resuspended so that the cell density becomes 1.6 x
10'* cells/50 µl, and inoculated 50 µl/well onto a 96-well plate. After incubation in a CO2 incubator for 1 h or longer, a
solution of a candidate compound at a proper concentration is added 50 µl/well onto the 96-well plate. After the addition,
cells are cultured in a CO2 incubator for 40 to 48 h. The transcriptional regulatory activity is then confirmed by luciferase
assay according to the following procedure.

[0059] After the culture medium is removed from the 96-well plate, wells are washed with PBS(-) (100 µl). To the
washed cells was added 1 x Passive lysis buffer (Promega) (20 µl), and the cell suspension was incubated at room
temperature for more than 15 min to lyse the cells. The cell lysate thus obtained as the test sample was made luminescent,
and the luminescence intensity was measured with a luminometer such as MLX (DYNAX Technologies).


Industrial Applicability 

[0060] The present invention provides a method for determining the correlation between test samples in group A and
those in group B when the two groups are correlated so as to interact. The number of reaction procedures required for
screening can be remarkably reduced in the present invention by utilizing test samples in group A mixed according to
the principle of binary notation. Theoretically, the number of screenings decreases when there are seven or more test
samples in group A, and as the number of test samples in group A increases, the reduction becomes greater.
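The scale of the reduction can be illustrated with a short calculation. The sketch below is an assumption-laden illustration rather than the patent's own accounting: it counts only the detection reactions themselves (one per pair in the one-by-one approach, one per mixture and group B sample in the pooled approach), ignores any confirmatory screening, and uses hypothetical function names.

```python
def bits_needed(m):
    """Smallest n with 2**(n-1) <= m <= 2**n - 1, i.e. the ID length for m samples."""
    return m.bit_length()


def reaction_counts(m, x, include_all_mixture=False):
    """Return (one-by-one reactions, pooled reactions) for m group A and x group B samples."""
    mixtures = bits_needed(m) + (1 if include_all_mixture else 0)
    return m * x, mixtures * x


# The 10 transcription factors of this example against a single candidate compound.
print(reaction_counts(10, 1))                            # (10, 4)
print(reaction_counts(10, 1, include_all_mixture=True))  # (10, 5)
# A larger, purely hypothetical screen: 1000 group A samples against 96 group B samples.
print(reaction_counts(1000, 96))                         # (96000, 960)
```

Because the number of mixtures grows only with the number of bits in the ID, doubling group A adds a single mixture per group B sample under these assumptions.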

[0061] The field of application of the method for determining the correlation based on the present invention is
exemplified by mapping of STS and EST to a genomic library. These are precisely procedures for determining the
correlation of group A (i.e., STS) and group B (i.e., genome library) based on the hybridization of both groups.
Furthermore, because of the enormously large scale of both groups, the effect of the present invention to reduce the number
of screenings is highly promising. Through the increased efficiency of screening, the present invention will greatly
contribute to promoting large-scale genome analysis including the Human Genome Project.

Claims

1. A method for determining a combination of test samples out of those constituting groups A and B which correlate
physically, chemically or biologically, wherein said method comprises the following steps:

(1) providing m (2^(n-1) ≤ m ≤ 2^n - 1, where m and n are integers; m > 3, n > 2) test samples Ai (1 ≤ i ≤ m) in
group A and x (x is an integer) test samples Bj (1 ≤ j ≤ x) in group B,

(2) assigning a g-bit (n ≤ g) ID number based on the binary notation to each test sample Ai in group A,

(3) mixing test samples Ai in group A having "1" for the first bit of ID numbers based on binary notation to make
mixture C1, and similarly mixing test samples Ai in group A having "1" for the k-th (1 ≤ k ≤ g) bit of ID numbers
to make mixture Ck, thus obtaining g varieties of mixtures comprising mixtures from C1 through Cg,

(4) detecting the interaction of each of g varieties of mixtures from C1 through Cg with test samples Bj in group B,

(5) determining g-bit binary numbers having "1" or "0" for the k-th bit by assigning "1" when the interaction is
detected between each mixture constituting mixtures from C1 through Cg and Bj in group B, and "0" when no
interaction is detected, and

(6) determining the correlation between test sample Ai in group A and test sample Bj in group B by referring
test sample Ai in group A to the corresponding binary number obtained.
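Steps (2) through (6) can be sketched as follows for the case g = n with a 1:1 correlation. This is a minimal illustration under stated assumptions, not the claimed method itself: sample Ai simply receives the ID number i, the "first bit" is taken to be the lowest-order bit, the wet-lab detection is replaced by a set intersection, and all function names are hypothetical.

```python
def assign_ids(m):
    """Step (2): give sample A_i the ID number i, using g = n bits."""
    g = m.bit_length()  # satisfies 2**(g-1) <= m <= 2**g - 1
    return g, {i: i for i in range(1, m + 1)}


def make_mixtures(ids, g):
    """Step (3): mixture C_k pools every A_i whose k-th bit is 1 (k = 1 is the lowest bit)."""
    return {k: {i for i, id_ in ids.items() if (id_ >> (k - 1)) & 1}
            for k in range(1, g + 1)}


def screen(mixtures, interacting):
    """Steps (4)-(5): one detection per mixture, packed into a g-bit binary number."""
    number = 0
    for k, members in mixtures.items():
        if members & interacting:  # stand-in for detecting an interaction with B_j
            number |= 1 << (k - 1)
    return number


def identify(number, ids):
    """Step (6): refer the binary number back to the sample carrying that ID."""
    return [i for i, id_ in ids.items() if id_ == number]


g, ids = assign_ids(7)
mixtures = make_mixtures(ids, g)
number = screen(mixtures, {5})  # suppose B_j interacts only with A_5
print(g, format(number, "03b"), identify(number, ids))  # 3 101 [5]
```

When no mixture reacts, the binary number is 0 and identify returns an empty list, since no sample is assigned the all-zero ID in this sketch.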

2. The method of claim 1, wherein the correlation between test samples involves the interaction between test samples
constituting groups A and B.

3. The method of claim 2, wherein the correlation based on the interaction between test samples is in the ratio of 1:1
or 1:many.

4. The method of claim 1, wherein g is n.

5. The method of claim 4, wherein each test sample in group A is assigned an individual ID number up to 2^n - 1.

6. The method of any one of claims 1 through 5, wherein said method comprises steps for detecting the interaction
between mixture Ca obtained by mixing all test samples in group A and test sample Bj in group B.

7. A method for determining a combination of test samples out of those constituting groups A and B and correlating
them physically, chemically or biologically, wherein said method comprises assigning ID numbers of more than two
series to one test sample in group A, and performing steps (3) to (6) of the method of claim 1 for each series of the
ID numbers.
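One way to picture the use of several ID series is as a consistency check. The sketch below is an assumption about how the series might be compared, not a statement of the claimed procedure: the second series is a hypothetical reversed numbering, the readout of the mixtures is modeled as the bitwise OR of the IDs of all interacting samples, and a result is accepted only when every series decodes to the same sample.

```python
def pooled_readout(ids, hits):
    """Bitwise OR of the IDs of all interacting samples: what the g mixtures report."""
    number = 0
    for sample in hits:
        number |= ids[sample]
    return number


def decode(ids, number):
    """Return the sample whose ID equals the readout, or None if no sample has it."""
    return next((s for s, id_ in ids.items() if id_ == number), None)


def cross_check(series, hits):
    """Decode the readout under each ID series; accept only a unanimous call."""
    calls = {decode(ids, pooled_readout(ids, hits)) for ids in series}
    return calls.pop() if len(calls) == 1 else None


m = 7
series_1 = {i: i for i in range(1, m + 1)}
series_2 = {i: m + 1 - i for i in range(1, m + 1)}  # hypothetical second numbering

print(cross_check([series_1, series_2], {3}))     # 3
print(cross_check([series_1, series_2], {3, 5}))  # None
```

In the second call, two interacting samples make series 1 point to sample 7 and series 2 point to sample 1, so the disagreement exposes a readout that a single series would have misassigned. Such a check can still miss some combinations, so it is a safeguard rather than a guarantee.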

8. The method of any one of claims 1 through 7, wherein said test sample in group A is an oligonucleotide and said
test sample in group B is DNA.

9. An STS mapping method comprising performing the method of claim 8, wherein said method uses STS markers
as test samples in group A and genome libraries as test samples in group B.
